Klatt Cascade / Parallel Formant Synthesizer
--------------------------------------------
History
-------
This file contains a version of the Klatt Cascade / Parallel Formant
Speech Synthesizer. The software for this synthesizer was originally
described in (1), and an updated version of the software was described
in (2). An up-to-date version of the software synthesizer as described
in (2) is commercially available from Sensimetrics (3).
The code contained within this directory is a translation of the
original Fortran into C by Dennis Klatt. In terms of the two articles
referred to above, this version sits midway in the development between
the two systems they describe.
Modifications
-------------
The main part of the code in this directory was posted to comp.speech
in early 1993 as part of a crude text to speech conversion system. The
code taken from comp.speech seemed to have been modified considerably
from the original, and before the synthesizer could be used in research
it was necessary to "fix" the changes that had been made. The major
changes I have made are:
1. Re-introduced the parallel-only / cascade-parallel switch. This
allows choice of synthesis method, either using both branches, or just
using the parallel branch.
2. Correct use of bandwidth parameters. One of the cascade bandwidth
parameters was being wrongly used in the parallel branch of the
synthesizer.
3. Correct operation of natural voicing source. The amplitude of the
natural voicing source was very much smaller than the amplitude of the
impulse source, making it difficult to swap between them to evaluate
the differences.
4. Removed the software synthesizer from the context of a text to
speech system. The synthesizer is now a stand-alone program, accepting
input as a set of parameters from a file, and allowing output to a
file or to stdout.
5. Added command line options to control the parameters that remain
constant during synthesis.
6. Added F0 flutter control, as described in (2).
Input File Format
-----------------
The input file consists of a series of parameter frames. Each frame of
parameters (usually) represents 10ms of audio output. The parameters
in each frame are described below. To avoid confusion, note that the
cascade and parallel branch of the synthesizer duplicate some of the
control parameters.
f0 Fundamental frequency (pitch) of the utterance. It is
specified in steps of 0.1 Hz, so 100 Hz is represented by a
value of 1000.
av Amplitude of voicing for the cascade branch of the
synthesizer in dB. Range 0-80, value usually 60 for a vowel sound.
f1 First formant frequency in Hz.
b1 Cascade branch bandwidth of first formant in Hz.
f2 Second formant frequency in Hz.
b2 Cascade branch bandwidth of second formant in Hz.
f3 Third formant frequency in Hz.
b3 Cascade branch bandwidth of third formant in Hz.
f4 Fourth formant frequency in Hz.
b4 Cascade branch bandwidth of fourth formant in Hz.
f5 Fifth formant frequency in Hz.
b5 Cascade branch bandwidth of fifth formant in Hz.
f6 Sixth formant frequency in Hz.
b6 Cascade branch bandwidth of sixth formant in Hz.
fnz Frequency of the nasal zero in Hz (cascade branch only)
bnz Bandwidth of the nasal zero in Hz (cascade branch only)
fnp Frequency of the nasal pole in Hz (cascade branch only)
bnp Bandwidth of the nasal pole in Hz (cascade branch only)
ap Amplitude of aspiration.
kopen Open quotient of voicing waveform, range 0-60, usually 30.
aturb Amplitude of turbulence 0-80. A value of 40 is fairly useful.
tilt Spectral tilt
af Amplitude of frication in dB, range 0-80 (parallel branch)
skew Spectral skew
a1 Amplitude of first formant in the parallel branch, in dB.
Range 0-80.
b1p Bandwidth of the first formant in the parallel branch, in Hz.
a2 Amplitude of parallel branch second formant.
b2p Bandwidth of parallel branch second formant.
a3 Amplitude of parallel branch third formant.
b3p Bandwidth of parallel branch third formant.
a4 Amplitude of parallel branch fourth formant.
b4p Bandwidth of parallel branch fourth formant.
a5 Amplitude of parallel branch fifth formant.
b5p Bandwidth of parallel branch fifth formant.
a6 Amplitude of parallel branch sixth formant.
b6p Bandwidth of parallel branch sixth formant.
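anp Amplitude of the nasal pole in the parallel branch.
ab Amplitude of bypass frication in the parallel branch.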
avp Amplitude of voicing for the parallel branch
gain Overall gain in dB, range 0-80. 50 is a useful value.
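As an illustration of the frame layout listed above, here is a minimal
sketch in C of reading one frame. It assumes each frame is stored as 40
whitespace-separated decimal integers in the order given above; this
README does not spell out the on-disk encoding, so treat that as an
assumption. The names P_F0 ... P_GAIN, KLATT_NPAR, parse_frame and
f0_in_hz are hypothetical and are not taken from the synthesizer source.

```c
#include <assert.h>
#include <stdlib.h>

/* Index of each parameter within a frame, in the order listed above.
   These names are illustrative only. */
enum {
    P_F0, P_AV,
    P_F1, P_B1, P_F2, P_B2, P_F3, P_B3, P_F4, P_B4, P_F5, P_B5, P_F6, P_B6,
    P_FNZ, P_BNZ, P_FNP, P_BNP,
    P_AP, P_KOPEN, P_ATURB, P_TILT, P_AF, P_SKEW,
    P_A1, P_B1P, P_A2, P_B2P, P_A3, P_B3P,
    P_A4, P_B4P, P_A5, P_B5P, P_A6, P_B6P,
    P_ANP, P_AB, P_AVP, P_GAIN,
    KLATT_NPAR              /* number of values per frame (40) */
};

/* Parse one frame from `text` (assumed: whitespace-separated integers).
   Returns 1 if KLATT_NPAR values were read, 0 if the text ran out or
   contained something that is not a number. */
static int parse_frame(const char *text, long out[KLATT_NPAR])
{
    assert(text != NULL && out != NULL);
    for (int i = 0; i < KLATT_NPAR; i++) {
        char *end;
        long v = strtol(text, &end, 10);
        if (end == text)    /* nothing converted: short or malformed frame */
            return 0;
        out[i] = v;
        text = end;         /* continue after the number just read */
    }
    return 1;
}

/* f0 is stored in steps of 0.1 Hz, so 1000 means 100 Hz. */
static double f0_in_hz(const long frame[KLATT_NPAR])
{
    return frame[P_F0] / 10.0;
}
```

With a frame whose first value is 1000, f0_in_hz() gives 100.0, which
matches the f0 description above. Keeping the frame as a flat array
indexed by an enum keeps the file order and the in-memory order
visibly the same.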
Command Line Options
--------------------
-h Displays a help message.
-i <filename> sets input filename.
-o <outfile> sets output filename.
If output filename not specified, stdout is used.
-q quiet - print no messages.
-t <n> select output waveform (see the source code for the valid values)
-c select cascade-parallel configuration.
Parallel only configuration is default.
-n <number> Number of formants in cascade branch.
Default is 5.
-s <n> set sample rate
Default is 10 kHz.
-f <n> set number of milliseconds per frame.
Default is 10ms per frame
-v Specifies that the impulse voicing source is used.
Default is natural voicing
-F <percent> percentage of f0 flutter
Default is 0.
References
----------
(1) @Article{klatt1980,
author = "Klatt, D.H.",
title = "Software for a cascade/parallel formant synthesizer",
journal = "Journal of the Acoustical Society of America",
year = "1980",
volume = "67",
number = "3",
pages = "971--995",
month = "March"}
(2) @Article{klatt1990,
author = "Klatt, D.H. and Klatt, L.C.",
title = "Analysis, synthesis and perception of voice quality
variations among female and male talkers.",
journal = "Journal of the Acoustical Society of America",
year = "1990",
volume = "87",
number = "2",
pages = "820--857",
month = "February"}
(3) Dr. David Williams at
Sensimetrics Corporation,
64 Sidney Street,
Cambridge,
MA 02139.
Fax: (617) 225-0470
Tel: (617) 225-2442
E-mail: sensimetrics@sens.com